    Reinforcement learning-based approximate optimal control for attitude reorientation under state constraints

    This paper addresses the attitude reorientation problem of rigid bodies under multiple state constraints. A novel reinforcement learning (RL)-based approximate optimal control method is proposed to balance control cost and performance. The novelty lies in its guaranteed constraint-handling ability for attitude forbidden zones and angular-velocity limits. To achieve this, barrier functions are employed to encode the constraint information into the cost function. An RL-based learning strategy is then developed to approximate the optimal cost function and control policy. A simplified critic-only neural network (NN) replaces the conventional actor-critic structure once adequate data is collected online. This design guarantees the uniform boundedness of reorientation errors and NN weight-estimation errors subject to a finite excitation condition, a relaxation of the persistent excitation condition typically required for this class of problems. More importantly, all underlying state constraints are strictly obeyed during the online learning process. The effectiveness and advantages of the proposed controller are verified by both numerical simulations and experimental tests on a comprehensive hardware-in-the-loop testbed.
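    The constraint-encoding step can be illustrated with a minimal sketch. The limits, the forbidden-zone direction, and the function names below are illustrative assumptions, not taken from the paper: log-barrier terms are added to a quadratic running cost so that the cost grows without bound as the state approaches a constraint boundary.

```python
import numpy as np

# Hypothetical constraint data (illustrative values, not from the paper).
OMEGA_MAX = 0.1                               # rad/s, per-axis velocity limit
FORBIDDEN_AXIS = np.array([0.0, 0.0, 1.0])    # inertial direction to avoid
THETA_MIN = np.deg2rad(20.0)                  # min allowed angle to that axis

def barrier_cost(x_err, omega, boresight, u, Q=1.0, R=0.1):
    """Running cost = quadratic state/control cost + log barriers.
    Valid only inside the constraint set (|omega_i| < OMEGA_MAX and
    angle(boresight, FORBIDDEN_AXIS) > THETA_MIN); barriers blow up
    as the state approaches a boundary."""
    quad = Q * np.dot(x_err, x_err) + R * np.dot(u, u)
    # Barrier on each angular-velocity component.
    b_omega = -np.sum(np.log(1.0 - (omega / OMEGA_MAX) ** 2))
    # Barrier keeping the boresight outside the forbidden cone.
    cos_theta = np.dot(boresight, FORBIDDEN_AXIS)
    b_zone = -np.log(np.cos(THETA_MIN) - cos_theta)
    return quad + b_omega + b_zone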

    Optimal tracking control for uncertain nonlinear systems with prescribed performance via critic-only ADP

    This paper addresses the tracking control problem for a class of nonlinear systems described by Euler-Lagrange equations with uncertain system parameters. The proposed control scheme guarantees prescribed performance in two respects: 1) a special parameter estimator with prescribed-performance properties is embedded in the control scheme; it not only ensures the exponential convergence of the estimation errors under relaxed excitation conditions but also restricts all estimates to pre-determined bounds during the whole estimation process; 2) the proposed controller strictly guarantees the user-defined performance specifications on tracking errors, including convergence rate, maximum overshoot, and residual set. More importantly, it can optimize the trade-off between performance and control cost. A state transformation method is employed to transform the constrained optimal tracking control problem into an unconstrained stationary optimization problem. A critic-only adaptive dynamic programming algorithm is then designed to approximate the solution of the Hamilton-Jacobi-Bellman equation and the corresponding optimal control policy. Uniformly ultimately bounded stability is guaranteed via Lyapunov-based stability analysis. Finally, numerical simulation results demonstrate the effectiveness of the proposed control scheme.
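    The state transformation that removes the performance constraints can be sketched as follows, assuming a standard exponentially decaying performance funnel; the funnel parameters and the log-ratio mapping are common illustrative choices, not necessarily the paper's exact construction. As long as the transformed variable stays bounded, the tracking error stays inside the prescribed bound.

```python
import numpy as np

# Hypothetical funnel parameters: initial bound, residual bound, decay rate.
rho0, rho_inf, decay = 1.0, 0.05, 2.0

def rho(t):
    """Exponentially shrinking performance bound, enforcing |e(t)| < rho(t)."""
    return (rho0 - rho_inf) * np.exp(-decay * t) + rho_inf

def transform(e, t):
    """Map the constrained error e in (-rho(t), rho(t)) to an unconstrained
    variable; boundedness of the result implies the funnel is respected."""
    z = e / rho(t)                     # normalized error, valid for |z| < 1
    return 0.5 * np.log((1.0 + z) / (1.0 - z))
```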

    A deep-learning approach for reservoir evaluation for shale gas wells with complex fracture networks

    The complex fracture networks in shale gas reservoirs bring great challenges and uncertainty to modeling for reservoir evaluation. As an emerging technology, deep learning can be usefully applied to many aspects of reservoir evaluation. To advance reservoir evaluation in rate transient analysis, this work proposes a data-driven proxy model for accurately evaluating horizontal wells with complex fracture networks in shales. The production time, variable bottom-hole pressure, and fracture-network properties are used as input variables, while the output variable is the production over the forecast time period. Data generated by a boundary element method are used to train the proxy model. Shuffled cross-validation is used to increase the model's accuracy and generalizability. The proxy model incorporates recently developed deep-learning techniques, including the attention mechanism, skip connections, and cross-validation, to address the time-series analysis problem for multivariate operating and physical parameters. Results demonstrate that the attention mechanism is robust. The operating-parameter analysis shows that the attention mechanism can analyze variable pressure-drop/flow-rate data. Sensitivity analysis also indicates that the model accounts for the geometric characteristics of the fracture network. Model reliability is verified by a case study from the Marcellus shale. The computation time of the trained attention-mechanism model is approximately 0.3 s, which equates to 3.8% of the physical model's running time.

    Cited as: Chu, H., Dong, P., Lee, W. J. A deep-learning approach for reservoir evaluation for shale gas wells with complex fracture networks. Advances in Geo-Energy Research, 2023, 7(1): 49-65. https://doi.org/10.46690/ager.2023.01.0
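    A minimal sketch of the kind of architecture described, with attention over the time dimension and a skip connection; the layer sizes, feature count, and single-step output head are assumptions for illustration, not the paper's network.

```python
import torch.nn as nn

class ProxyNet(nn.Module):
    """Hedged sketch: multivariate time steps -> self-attention over time
    -> residual (skip) connection -> production forecast."""
    def __init__(self, n_features=8, d_model=64, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                 # x: (batch, time, n_features)
        h = self.embed(x)
        a, _ = self.attn(h, h, h)         # self-attention across time steps
        h = h + a                         # skip connection
        return self.head(h[:, -1])        # forecast at the last step
```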

    An improved method for predicting CO2 minimum miscibility pressure based on artificial neural network

    The CO2 enhanced oil recovery (EOR) method is widely used in actual oilfields, and accurately predicting the CO2 minimum miscibility pressure (MMP) is extremely important for CO2-EOR. At present, most MMP-prediction studies rely on empirical, experimental, or numerical-simulation methods, but these methods are limited in accuracy or computational efficiency, so further work is needed. In this work, using slim-tube experiment results expanded with the multiple-mixing-cell method, an improved artificial neural network (ANN) model is trained to predict the CO2 MMP from the full composition of the crude oil and the temperature. To stabilize the training process, L2 regularization and Dropout are used to address over-fitting. Prediction results show that the ANN model with Dropout has higher prediction accuracy and stronger generalization ability. On the validation samples, the mean absolute percentage error and R-squared of the ANN model are 6.99 and 0.948, respectively. Finally, the improved ANN model is tested on six samples from slim-tube experiments. The results indicate that the improved ANN model predicts the CO2 MMP with extremely low time cost and high accuracy, which is of great significance for CO2-EOR.

    Cited as: Dong, P., Liao, X., Chen, Z., Chu, H. An improved method for predicting CO2 minimum miscibility pressure based on artificial neural network. Advances in Geo-Energy Research, 2019, 3(4): 355-364, doi: 10.26804/ager.2019.04.0
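    The two regularization devices mentioned can be shown in a few lines. This is a hedged sketch, not the paper's exact network: Dropout layers sit between hidden layers, and L2 regularization enters through the optimizer's weight-decay term. The input width (composition fractions plus temperature) and all hyperparameters are assumed.

```python
import torch
import torch.nn as nn

# Illustrative feed-forward ANN: 12 assumed inputs (oil composition +
# temperature) -> hidden layers with Dropout -> predicted MMP.
model = nn.Sequential(
    nn.Linear(12, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 32), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(32, 1),
)
# L2 regularization is applied via the optimizer's weight_decay term.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```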

    Wind-farm power tracking via preview-based robust reinforcement learning

    This paper addresses the wind-farm power tracking problem, which requires the farm's total power generation to track time-varying power references and thereby allows the wind farm to participate in ancillary services such as frequency regulation. A novel preview-based robust deep reinforcement learning (PR-DRL) method is proposed to handle such tasks, which are subject to uncertain environmental conditions and strong aerodynamic interactions among wind turbines. To our knowledge, this is the first time a data-driven, model-free solution has been developed for wind-farm power tracking. In particular, reference signals are treated as preview information and embedded in the system as specially designed augmented states. The control problem is then transformed into a zero-sum game to quantify the influence of unknown wind conditions and future reference signals. Built upon H∞ control theory, the proposed PR-DRL method can successfully approximate the resulting zero-sum game's solution and achieve wind-farm power tracking. Time-series measurements and long short-term memory (LSTM) networks are employed in our DRL structure to handle the non-Markovian behavior induced by the time delays of aerodynamic interactions. Tests based on a dynamic wind farm simulator demonstrate the effectiveness of the proposed PR-DRL wind farm control strategy.
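    The preview-embedding idea can be sketched as a simple state augmentation, assuming a fixed preview horizon and 1-D arrays; the function and its shapes are illustrative, not the paper's implementation.

```python
import numpy as np

def augment_state(measurements, p_ref, t, horizon=10):
    """Stack current farm measurements with the next `horizon` reference
    samples so the policy can anticipate upcoming power set-points.
    `measurements` and `p_ref` are 1-D numpy arrays (assumed shapes)."""
    preview = p_ref[t : t + horizon]
    if len(preview) < horizon:                     # pad near the end of the
        preview = np.pad(preview,                  # reference trajectory
                         (0, horizon - len(preview)), mode="edge")
    return np.concatenate([measurements, preview])
```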

    Reinforcement learning-based wind farm control: towards large farm applications via automatic grouping and transfer learning

    The high system complexity and strong wake effects bring significant challenges to wind farm operations, and conventional wind farm control methods may degrade power-generation efficiency. A reinforcement learning (RL)-based approach is proposed in this paper to handle these issues: it can increase long-term farm-level power generation subject to strong wake effects without requiring analytical wind farm models. The proposed method differs significantly from existing RL-based wind farm control approaches, whose computational complexity usually grows rapidly with the total number of turbines. In contrast, our method can greatly reduce training loads and enhance learning efficiency via two novel designs: (1) automatic grouping and (2) multi-agent-based transfer learning (MATL). Automatic grouping divides a large wind farm into small turbine groups by analyzing the aerodynamic interactions between turbines and applying key principles from graph theory, as sketched below. It enables RL algorithms to be run separately on small turbine groups, avoiding the complex training process and high computational cost of applying RL to the entire farm. Building on automatic grouping, MATL further reduces the computational complexity by allowing agents (i.e., wind turbines) to inherit control policies under potential group changes. Case studies with a dynamical simulator show that the proposed method achieves clear power-generation increases over the benchmark. It also dramatically reduces computational costs compared with typical RL-based wind farm control methods, paving the way for the application of RL to general wind farms.
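    The grouping step can be illustrated with a connected-components computation on an interaction graph. This sketch assumes a symmetric matrix of pairwise aerodynamic-coupling strengths and an arbitrary threshold; the paper's actual grouping principles may differ.

```python
import numpy as np

def group_turbines(interaction, threshold=0.05):
    """Build a graph whose edges connect turbines with coupling above
    `threshold`, then return its connected components as RL groups.
    `interaction` is an assumed symmetric (n, n) numpy array."""
    n = interaction.shape[0]
    adj = interaction > threshold
    groups, seen = [], set()
    for s in range(n):
        if s in seen:
            continue
        stack, comp = [s], []
        while stack:                               # depth-first search
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            comp.append(i)
            stack.extend(j for j in range(n) if adj[i, j] and j not in seen)
        groups.append(sorted(comp))
    return groups
```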

    Data-driven wind farm control via multi-player deep reinforcement learning

    This brief paper proposes a novel data-driven control scheme to maximize the total power output of wind farms subject to strong aerodynamic interactions among wind turbines. The proposed method is model-free and has strong robustness, adaptability, and applicability. In particular, distinct from state-of-the-art data-driven wind farm control methods that commonly employ steady-state or time-averaged data (such as turbines' power outputs under steady wind conditions or from steady-state models) for learning, the proposed method directly mines, in depth, the time-series data measured at turbine rotors under time-varying wind conditions to achieve farm-level power maximization. The control scheme is built on a novel multi-player deep reinforcement learning method (MPDRL), in which a special critic-actor-distractor structure, along with deep neural networks (DNNs), is designed to handle the stochastic nature of wind speeds and learn optimal control policies subject to a user-defined performance metric. The effectiveness, robustness, and scalability of the proposed MPDRL-based wind farm control method are tested in prototypical case studies with a dynamic wind farm simulator. Compared with the commonly employed greedy strategy, the proposed method leads to clear increases in farm-level power generation.
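    A heavily hedged sketch of what a critic-actor-distractor structure might look like; the abstract does not specify these details, so all sizes and role assignments below are assumptions. The actor outputs turbine set-points, the distractor plays a disturbance modeling the stochastic wind, and the critic scores the joint outcome.

```python
import torch.nn as nn

def mlp(n_in, n_out):
    """Small two-layer network used for every player (sizes assumed)."""
    return nn.Sequential(nn.Linear(n_in, 64), nn.ReLU(), nn.Linear(64, n_out))

obs_dim, act_dim, dist_dim = 16, 4, 4              # hypothetical dimensions
actor      = mlp(obs_dim, act_dim)                 # control policy
distractor = mlp(obs_dim, dist_dim)                # stochastic-wind player
critic     = mlp(obs_dim + act_dim + dist_dim, 1)  # joint value estimate
```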

    Dual-Quaternion-Based Fault-Tolerant Control for Spacecraft Tracking With Finite-Time Convergence

    Results are presented from a study of dual-quaternion-based fault-tolerant control for spacecraft tracking. First, a six-degrees-of-freedom dynamic model under a dual-quaternion-based description is employed to describe the relative coupled motion of a target-pursuer spacecraft tracking system. Then, a novel fault-tolerant control method is proposed to enable the pursuer to track the attitude and position of the target even when its actuators are subject to multiple faults. Furthermore, based on a novel time-varying sliding manifold, finite-time stability of the closed-loop system is theoretically guaranteed, and the convergence time of the system can be given explicitly. The multiple-task capability of the proposed control law is further demonstrated in the presence of disturbances and parametric uncertainties. Finally, numerical simulations demonstrate the effectiveness and advantages of the proposed control method.
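    The dual-quaternion description couples attitude and position in a single algebraic object. As background, a minimal implementation of the dual-quaternion product using standard definitions (not code from this paper):

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions given as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def dq_mul(q1, q2):
    """Product of dual quaternions, each a (real, dual) pair of quaternions;
    the pair jointly encodes rotation (attitude) and translation (position)."""
    r1, d1 = q1
    r2, d2 = q2
    return qmul(r1, r2), qmul(r1, d2) + qmul(d1, r2)
```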

    Intelligent wind farm control via grouping-based reinforcement learning

    This paper aims to maximize the total power generation of wind farms subject to strong wake effects and stochastic inflow wind speeds. A data-driven control method that requires only the accessible measurements of every turbine in the farm is proposed via deep reinforcement learning (DRL). We employ a grouping strategy to mitigate the high computational complexity induced by DRL and to enhance the method's applicability to large-scale wind farms. Based on the levels of aerodynamic interaction among turbines, this grouping strategy divides the whole farm into small sub-groups, so DRL can be executed on these sub-groups instead of running a complicated learning process for the entire farm. Simulations verify the advantages of the proposed DRL-based wind farm control method over the commonly employed greedy strategy. Results also show that the proposed method can significantly reduce the overall computing cost compared with the direct execution of DRL on the whole wind farm.
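    The computational benefit of grouping follows from a simple counting argument. A back-of-envelope sketch with assumed numbers (the discretization level, farm size, and group sizes are hypothetical):

```python
# With k discrete set-points per turbine, joint learning over n turbines
# must cover k**n action combinations; grouping caps this per sub-group.
k, n, groups = 5, 12, [4, 4, 4]           # hypothetical 12-turbine farm
whole_farm = k ** n                       # 244_140_625 joint actions
grouped    = sum(k ** g for g in groups)  # 1_875 actions across sub-groups
print(whole_farm, grouped)
```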

    Composite experience replay based deep reinforcement learning with application in wind farm control

    In this article, a deep reinforcement learning (RL)-based control approach with enhanced learning efficiency and effectiveness is proposed to address the wind farm control problem. Specifically, a novel composite experience replay (CER) strategy is designed and embedded in the deep deterministic policy gradient (DDPG) algorithm. CER provides a new sampling scheme that mines the information of stored transitions in depth by making a trade-off between rewards and temporal-difference (TD) errors. Modified importance-sampling weights are introduced into the training process of the neural networks (NNs) to deal with the distribution-mismatch problem induced by CER. Our CER-DDPG approach is then applied to optimizing the total power production of wind farms. The main challenge of this control problem comes from the strong wake effects among wind turbines and the stochastic features of the environment, rendering it intractable for conventional control approaches. A reward regularization process is designed alongside the CER-DDPG, employing an additional NN to handle the bias in rewards caused by stochastic wind speeds. Tests with a dynamic wind farm simulator (WFSim) show that our method achieves higher rewards at lower training cost than conventional deep RL-based control approaches, and that it can increase the total power generation of wind farms with different specifications.
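    The CER sampling idea can be sketched as a probability mixture over stored transitions, with importance-sampling weights correcting the induced distribution mismatch. The mixing rule and weight normalization below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def composite_sample(rewards, td_errors, batch, lam=0.5, rng=np.random):
    """Sample transition indices with probability mixing reward magnitude
    and TD-error magnitude, and return importance-sampling weights that
    compensate for the non-uniform sampling (normalized for stability)."""
    r = np.abs(rewards) / (np.abs(rewards).sum() + 1e-8)
    d = np.abs(td_errors) / (np.abs(td_errors).sum() + 1e-8)
    p = lam * r + (1.0 - lam) * d      # trade-off between the two signals
    p /= p.sum()
    idx = rng.choice(len(p), size=batch, p=p)
    w = 1.0 / (len(p) * p[idx])        # importance-sampling weights
    return idx, w / w.max()
```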